Data structures for statistical computing in Python

Author

  • Wes McKinney

Abstract

In this paper we are concerned with the practical issues of working with data sets common to finance, statistics, and other related fields. pandas is a new library which aims to facilitate working with these data sets and to provide a set of fundamental building blocks for implementing statistical models. We will discuss specific design issues encountered in the course of developing pandas with relevant examples and some comparisons with the R language. We conclude by discussing possible future directions for statistical computing and data analysis using Python.

Related resources

Common Spatial Patterns Feature Extraction and Support Vector Machine Classification for Motor Imagery with the SecondBrain

Large sets of electroencephalography (EEG) data are now being generated by several high-quality labs worldwide and are freely available to all researchers. Many neuroscience researchers, in turn, need these data to study different neural disorders, both for better diagnosis and for evaluating treatment. However, some format adaptation and pre-processing are necessary before ...

gpustats: GPU Library for Statistical Computing in Python

In this talk we will discuss gpustats, a new Python library for assisting in “big data” statistical computing applications, particularly Monte Carlo-based inference algorithms. The library provides a general code generation / metaprogramming framework for easily implementing discrete and continuous probability density functions and random variable samplers. These functions can be utilized to ach...

Estimating scour below inverted siphon structures using stochastic and soft computing approaches

This paper uses nonlinear regression, Artificial Neural Network (ANN) and Genetic Programming (GP) approaches to predict an important practical quantity, namely the scour dimensions downstream of inverted siphon structures. Dimensional analysis and nonlinear regression-based equations were proposed for estimation of maximum scour depth, location of the scour hole, location and height of the dune downs...

Synergies in scientific computing by combining multi-paradigmatic languages for high-performance applications

The challenging art of multi-paradigmatic application development, which only a few languages currently support, greatly aids the development of highly efficient and reusable software components. A link between two such languages, namely Python and C++, is presented. Thereby data structures and algorithms realised in C++ using features such as compile-time meta-programming are made available to the r...

MathChem: A Python Package For Calculating Topological Indices

We introduce MathChem, an open-source and cross-platform Python package, aimed at supporting research in mathematical chemistry. MathChem enables researchers to load batches of molecules or molecular graphs from external files or the NCI online database, calculate topological indices, perform statistical analyses and visualize the results. As a Python package, MathChem is easily integrable with Sag...

Publication date: 2010